Combinatorial characterizations and impossibilities for higher-order homophily

Homophily is the seemingly ubiquitous tendency for people to connect and interact with other individuals who are similar to them. This is a well-documented principle and is fundamental for how society organizes. Although many social interactions occur in groups, homophily has traditionally been measured using a graph model, which only accounts for pairwise interactions involving two individuals. Here, we develop a framework using hypergraphs to quantify homophily from group interactions. This reveals natural patterns of group homophily that appear with gender in scientific collaboration and political affiliation in legislative bill cosponsorship and also reveals distinctive gender distributions in group photographs, all of which cannot be fully captured by pairwise measures. At the same time, we show that seemingly natural ways to define group homophily are combinatorially impossible. This reveals important pitfalls to avoid when defining and interpreting notions of group homophily, as higher-order homophily patterns are governed by combinatorial constraints that are independent of human behavior but are easily overlooked.


Supplementary Material for
Combinatorial Characterizations and Impossibilities for Higher-order Homophily
Nate Veldt, Austin R. Benson

In this section we provide full details for our theoretical results regarding hypergraph affinity scores. We begin by reviewing terminology and introducing additional notation in Section 1.1. We then show how to interpret affinity scores as maximum likelihood estimates for a certain affinity parameter of a binomial model for degree data (Section 1.3), cover additional background on baseline scores (Section 1.4), and then prove our main theoretical impossibility results for hypergraph homophily (Section 2). In Section 3, we show how these results can be adapted to an alternative notion of affinity scores.

Notation and Terminology
Consider a hypergraph $H = (V, E)$ where $V$ is a set of $n = |V|$ nodes and $E$ is a set of $m = |E|$ hyperedges. We assume throughout that $H$ is $k$-uniform (where $k$ is constant) and non-degenerate, meaning that all hyperedges are of a fixed size $k$, and a node can appear at most once in a hyperedge. We also assume that each node belongs to exactly one of two classes $A \subseteq V$ and $B \subseteq V$, where $A \cup B = V$ and $A \cap B = \emptyset$.
For class $X \in \{A, B\}$ and integer $t \in [k] = \{1, 2, \ldots, k\}$, we say that a hyperedge $e \in E$ is of type-$(X, t)$ if exactly $t$ nodes in $e$ are from class $X$. Let $m_t(X)$ denote the number of type-$(X, t)$ hyperedges in $E$. Since there are exactly two node classes, $m_t(A) = m_{k-t}(B)$ and $m_t(B) = m_{k-t}(A)$. It is often convenient to refer to hyperedge types in an absolute sense, without specifying class. We say that a hyperedge is of absolute type-$t$ if exactly $t$ of its nodes are from class $A$, and denote the number of such edges by

$$m_t = m_t(A) = \text{number of hyperedges of absolute type-}t \text{ in } E. \tag{1}$$

The degree of a node $v \in V$, denoted $d(v)$, is the number of hyperedges it participates in. For an integer $t \in [k]$, let $d_t(v)$ denote the number of hyperedges $v$ participates in where exactly $t$ nodes are from $v$'s class, including $v$ itself. We refer to this as the type-$t$ degree of $v$. Summing typed degrees produces the degree of a node:

$$d(v) = \sum_{t=1}^{k} d_t(v).$$

The type-$t$ affinity score for a class $X \in \{A, B\}$ measures the propensity for nodes in this class to participate in type-$(X, t)$ hyperedges. This score can be expressed in terms of sums of node degrees:

$$h_t(X) = \frac{\sum_{v \in X} d_t(v)}{\sum_{v \in X} d(v)}.$$

This directly generalizes the homophily index of a graph [16], which is defined similarly in terms of typed degrees, and corresponds to the case where $k = t = 2$. Affinity scores can also be expressed in terms of hyperedge counts. The sum of type-$t$ degrees for a class $X$ satisfies

$$\sum_{v \in X} d_t(v) = t \cdot m_t(X).$$

The value $m_t(X)$ is scaled by a factor $t$ to account for the fact that each type-$(X, t)$ hyperedge affects the degree of $t$ different nodes from class $X$. We can express type-$t$ affinity scores for both classes in terms of absolute hyperedge types as follows:

$$h_t(A) = \frac{t \cdot m_t}{\sum_{i=1}^{k} i \cdot m_i}, \qquad h_t(B) = \frac{t \cdot m_{k-t}}{\sum_{i=0}^{k} (k - i) \cdot m_i}. \tag{3}$$
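The typed-degree definition above can be checked on a toy example. The following is a minimal sketch, not code from the paper: the function name `affinity_scores`, the node labels, and the four hyperedges are all hypothetical. It computes each score as a ratio of summed typed degrees to summed degrees.

```python
from collections import Counter

def affinity_scores(hyperedges, labels, k):
    """Type-t affinity scores h_t(X) computed from typed degrees, for t = 1..k."""
    typed = {c: Counter() for c in set(labels.values())}  # sum over v in X of d_t(v)
    total = Counter()                                      # sum over v in X of d(v)
    for e in hyperedges:
        assert len(set(e)) == k, "hypergraph must be k-uniform and non-degenerate"
        in_class = Counter(labels[v] for v in e)
        for v in e:
            c = labels[v]
            typed[c][in_class[c]] += 1  # v lies in a hyperedge with t nodes of its class
            total[c] += 1
    return {c: {t: typed[c][t] / total[c] for t in range(1, k + 1)}
            for c in typed if total[c] > 0}

# Hypothetical toy hypergraph: three nodes in class A, two in class B, k = 3.
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B"}
edges = [(0, 1, 2), (0, 1, 3), (0, 3, 4), (1, 2, 4)]
scores = affinity_scores(edges, labels, k=3)
```

Here the absolute hyperedge counts are $m_0 = 0$, $m_1 = 1$, $m_2 = 2$, $m_3 = 1$, so the count-based expression gives $h_2(A) = 2 \cdot 2 / (1 + 4 + 3) = 1/2$, which agrees with the typed-degree computation.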

Cardinality-Based Hypergraph Stochastic Block Model
In order to provide a statistical interpretation for hypergraph affinity scores, we define a simple new generative model for hypergraphs. For this model, consider a set of nodes V separated into two classes A and B.
We say a tuple of $k$ distinct nodes in $V$ is a type-$t$ $k$-tuple if exactly $t$ of the nodes in the tuple are from class $A$. For each $t \in \{0, 1, \ldots, k\}$, define a probability $p_t \in [0, 1]$. We emphasize that these probabilities are defined with respect to a fixed class $A$; since it is possible to have hyperedges where all nodes are from class $B$, this also includes a probability $p_0$. We define the cardinality-based hypergraph stochastic block model (cardinality-based HSBM) as follows: for each type-$t$ $k$-tuple of nodes $T = (v_1, v_2, \ldots, v_k)$, we generate a hyperedge on $T$ with probability $p_t$. We denote the distribution of cardinality-based hypergraph stochastic block models with size-$k$ hyperedges by $\mathcal{H}(n, k, A, B, \mathbf{p})$, where $\mathbf{p} = [p_0, p_1, \ldots, p_k]$ is a vector of hyperedge probabilities. This is a special case of the general $k$-uniform hypergraph stochastic block model [30], which may involve more than two ground truth clusters or classes.
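The generative process can be sketched in a few lines. This is an illustrative sampler under the assumption that $k$-tuples are treated as unordered sets of distinct nodes; the function name `sample_cardinality_hsbm` and its parameters are hypothetical and not taken from the paper.

```python
import random
from itertools import combinations

def sample_cardinality_hsbm(class_a, class_b, k, p, seed=None):
    """Sample hyperedges: each k-subset with t nodes from class A is kept with probability p[t]."""
    rng = random.Random(seed)
    a_set = set(class_a)
    nodes = list(class_a) + list(class_b)
    hyperedges = []
    for tup in combinations(nodes, k):
        t = sum(1 for v in tup if v in a_set)  # absolute type of this k-tuple
        if rng.random() < p[t]:
            hyperedges.append(tup)
    return hyperedges

# p = [p_0, ..., p_k]; setting only p_k = 1 keeps exactly the all-A hyperedges.
only_all_a = sample_cardinality_hsbm(range(3), range(3, 6), k=3, p=[0, 0, 0, 1])
```

Enumerating all $\binom{n}{k}$ subsets is only practical for small $n$; the sketch is meant to make the definition concrete rather than to scale.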

Affinity Scores as Maximum Likelihood Estimates
We now show how affinity scores for class A can be derived as maximum likelihood estimates for an affinity parameter of a certain binomial distribution. The same approach could also be applied to class B.

Type-degree Random Variables
Assume we are given a $k$-uniform hypergraph from the cardinality-based HSBM $\mathcal{H}(n, k, A, B, \mathbf{p})$, where the probability vector $\mathbf{p} = [p_0, p_1, \ldots, p_k]$ is given up front and fixed. For a node $a \in A$, let $T_j$ be the total number of type-$j$ $k$-tuples of nodes that $a$ belongs to:

$$T_j = \binom{|A| - 1}{j - 1} \binom{|B|}{k - j}.$$

This value is the same regardless of which $a \in A$ we consider, and represents the maximum number of type-$j$ hyperedges that $a$ could belong to in an $n$-node hypergraph with node classes $A$ and $B$. The type-$t$ degree of each node $a \in A$, conditioned on probability $p_t$, will be a binomial random variable $D_t(a) \sim \text{Binom}(T_t, p_t)$.
We also define a random variable $D(a)$ for the total degree of $a \in A$ by

$$D(a) = \sum_{j=1}^{k} D_j(a).$$

Finally, define another random variable for measuring the contribution to $D(a)$ made by hyperedges that are not of type-$t$:

$$\bar{D}_t(a) = \sum_{j \neq t} D_j(a).$$

For any fixed $t$, $D_t(a) + \bar{D}_t(a) = D(a)$. The degree random variables $D(a)$ will not be independent for different $a \in V$, since the degrees must define a valid degree sequence for a $k$-uniform hypergraph. However, we can prove that they will be approximately independent by adapting existing techniques for graphs [31].
Supplemental Lemma 1. Let $H \sim \mathcal{H}(n, k, A, B, \mathbf{p})$ be a cardinality-based HSBM with hyperedge parameters satisfying $\bar{p} = \max_i p_i = O\left(\frac{1}{n^{k-1}}\right)$. If $\ell > k$ is a fixed constant, the degree random variables for any set of $\ell$ nodes, $D(1), D(2), \ldots, D(\ell)$, are asymptotically independent.
Proof. Let $K$ denote the set of $k$-tuples of nodes in $H$, and let $L$ denote the set of $\ell$ nodes we are considering, which we denote by $\{1, 2, \ldots, \ell\}$ without loss of generality. Each $e \in K$ is associated with a Bernoulli random variable $X_e$ such that $X_e = 1$ if an edge is placed at $k$-tuple $e$. Each random variable $D(i)$ for $i \in [\ell]$ is a sum of Bernoulli random variables:

$$D(i) = \sum_{e \in K \,:\, i \in e} X_e.$$

For $i, j \in [\ell]$, $D(i)$ and $D(j)$ are not independent, as $X_e$ appears in both of their sums whenever $i \in e$ and $j \in e$. If $e$ is a $k$-tuple of nodes from $L$, then $X_e$ contributes to the sum of all of the random variables $(D(i))_{i \in L}$. In general, for an arbitrary $e \in K$, the variable $X_e$ shows up in $|e \cap L|$ of these degree variables. Note that for each $s \in \{2, 3, \ldots, k\}$, there are $\binom{\ell}{s}\binom{n - \ell}{k - s}$ distinct $k$-tuples involving $s$ nodes from $L$ and $k - s$ nodes from $V \setminus L$.
In order to prove that the random variables $(D(i))_{i \in L}$ are approximately independent, for each $i \in L$ we construct a new random variable $\hat{D}(i)$ in such a way that $D(i)$ and $\hat{D}(i)$ have the same distribution, and so that the $\hat{D}(i)$ variables are mutually independent. In order to establish asymptotic independence of the original $D(i)$ variables, we then prove that

$$\lim_{n \to \infty} \Pr\left( (D(i))_{i \in L} \neq (\hat{D}(i))_{i \in L} \right) = 0.$$

In order to accomplish this, for each $X_e$ that shows up in more than one variable from $(D(i))_{i \in L}$, we construct $|e \cap L| - 1$ other independent copies of this random variable $X_e$, denoted by $X_e^{(2)}, X_e^{(3)}, \ldots, X_e^{(s)}$ where $s = |e \cap L|$. Define $X_e^{(1)} = X_e$ for notational convenience. We then define a new variable $\hat{D}(i)$ for each $i \in L$, which is the same as $D(i)$, except we carefully replace the $X_e$ variables with the independent copies of $X_e$, in order to ensure the $\hat{D}(i)$ variables are independent. We begin by defining

$$\hat{D}(1) = D(1).$$

Then, to define $\hat{D}(j)$ for $j > 1$, we start with the same sum of random variables that defines $D(j)$, but then we replace each $X_e$ in this sum by the copy $X_e^{(r+1)}$, where $r$ is the number of times that $X_e$ appeared in a sum $D(i)$ for $i < j$. Thus, if $X_e$ shows up $s$ times in the summations defining $(D(i))_{i \in L}$, we have constructed $s$ independent copies of $X_e$ and used these in defining $(\hat{D}(i))_{i \in L}$. As a result, $\hat{D}(i)$ and $D(i)$ have the same distribution, but the variables $(\hat{D}(i))_{i \in L}$ are independent.
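What remains is to prove that the probability that $(D(i))_{i \in L}$ and $(\hat{D}(i))_{i \in L}$ are not equal goes to zero. These random variables will be the same if, for every $k$-tuple $e$ with $s = |e \cap L| \geq 2$, the variables $X_e^{(1)}, \ldots, X_e^{(s)}$ all coincide. The probability that they coincide is the probability that they all equal one or all equal zero, so, writing $p_e$ for the probability that a hyperedge is placed at $e$:

$$\Pr\left(X_e^{(1)}, \ldots, X_e^{(s)} \text{ coincide}\right) = p_e^s + (1 - p_e)^s \geq (1 - \bar{p})^s.$$

For $s \in \{2, 3, \ldots, k\}$, let $K_s$ denote the set of $k$-tuples in $K$ with exactly $s$ nodes from $L$. We can use Boole's inequality to bound the probability that $(D(i))_{i \in L}$ and $(\hat{D}(i))_{i \in L}$ are not equal:

$$\Pr\left( (D(i))_{i \in L} \neq (\hat{D}(i))_{i \in L} \right) \leq \sum_{s=2}^{k} \sum_{e \in K_s} \Pr\left(X_e^{(1)}, \ldots, X_e^{(s)} \text{ do not coincide}\right) \leq \sum_{s=2}^{k} \binom{\ell}{s}\binom{n - \ell}{k - s} \left(1 - (1 - \bar{p})^s\right).$$

Finally, if $\bar{p} = O(n^{-\alpha})$ for some integer $\alpha$, we know that $\bar{p} \leq C n^{-\alpha}$ for some constant $C$, so we have the following asymptotic result for each $s$:

$$\binom{\ell}{s}\binom{n - \ell}{k - s} \left(1 - (1 - C n^{-\alpha})^s\right) = \binom{\ell}{s}\binom{n - \ell}{k - s} \cdot \frac{n^{\alpha s} - (n^{\alpha} - C)^s}{n^{\alpha s}} = O\left(n^{k - s - \alpha}\right),$$

where we have used the fact that $(n^{\alpha} - C)^s = n^{\alpha s} - sCn^{\alpha(s-1)} + f(n)$, where $f(n)$ represents lower order terms in $n$. Thus, as long as $\alpha \geq k - 1 > k - s$, we see that this entire expression goes to zero for every $s \in \{2, 3, \ldots, k\}$, and so we have our desired asymptotic result:

$$\lim_{n \to \infty} \Pr\left( (D(i))_{i \in L} \neq (\hat{D}(i))_{i \in L} \right) = 0.$$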

Conditional Distribution of Type-t Degrees
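Assume we observe a two-class $k$-uniform hypergraph for which the degree of node $a \in A$ is given by $d(a)$. Given our fixed hyperedge probabilities $p_j$ for each edge type, we can prove that for a fixed $t$, the conditional random variable satisfies

$$\left(D_t(a) \mid D(a) = d(a)\right) \sim \text{Binom}(d(a), f_t), \tag{10}$$

where $f_t$ is the affinity parameter:

$$f_t = \frac{p_t T_t}{\sum_{j=1}^{k} p_j T_j}.$$

This holds specifically for the parameter regime where $T_j$ is large but $p_j \cdot T_j$ is constant for all $j \in [k]$. Under the assumed conditions on $p_j$ and $T_j$, the binomial distributions of the typed degrees are asymptotically equivalent to Poisson distributions:

$$D_t(a) \sim \text{Poisson}(p_t T_t). \tag{12}$$

A sum of binomials is also asymptotically equivalent to a Poisson distribution with the same mean [32], so we also have that

$$D(a) \sim \text{Poisson}\Big(\sum_{j=1}^{k} p_j T_j\Big). \tag{13}$$

Similarly,

$$\bar{D}_t(a) \sim \text{Poisson}\Big(\sum_{j \neq t} p_j T_j\Big). \tag{14}$$

By our assumption that $p_j T_j = O(1)$, the right hand sides of (12), (13), and (14) are all constants. Therefore, asymptotically the distribution of $(D_t(a) \mid D(a) = d(a))$ is a binomial with parameter $f_t$, since $D_t(a)$ and $\bar{D}_t(a)$ are sums over disjoint sets of $k$-tuples, and for independent Poisson variables

$$\Pr\left(D_t(a) = x \mid D(a) = d(a)\right) = \frac{\Pr(D_t(a) = x) \Pr(\bar{D}_t(a) = d(a) - x)}{\Pr(D(a) = d(a))} = \binom{d(a)}{x} f_t^{x} (1 - f_t)^{d(a) - x},$$

and therefore

$$\lim_{n \to \infty} \Pr\left(D_t(a) = x \mid D(a) = d(a)\right) = \binom{d(a)}{x} f_t^{x} (1 - f_t)^{d(a) - x}.$$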
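The key step in this argument is the standard fact that an independent Poisson variable, conditioned on its sum with another independent Poisson variable, is binomial. The sketch below checks this numerically; the rates `lam_t` and `lam_rest` are illustrative stand-ins for $p_t T_t$ and $\sum_{j \neq t} p_j T_j$, and the helper names are hypothetical.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam ** x / factorial(x)

def conditional_pmf(x, d, lam_t, lam_rest):
    """Pr(D_t = x | D_t + Dbar_t = d) for independent Poisson D_t and Dbar_t."""
    joint = poisson_pmf(x, lam_t) * poisson_pmf(d - x, lam_rest)
    return joint / poisson_pmf(d, lam_t + lam_rest)

def binomial_pmf(x, d, f):
    return comb(d, x) * f ** x * (1 - f) ** (d - x)

lam_t, lam_rest, d = 1.5, 2.5, 6       # illustrative values only
f_t = lam_t / (lam_t + lam_rest)       # affinity parameter
max_gap = max(abs(conditional_pmf(x, d, lam_t, lam_rest) - binomial_pmf(x, d, f_t))
              for x in range(d + 1))
```

The two probability mass functions agree up to floating-point error for every $x \in \{0, \ldots, d\}$.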

Affinity Index as Maximum Likelihood Estimate
Given observed degree data $(d(a), d_t(a) \mid a \in A)$ for a two-class, $k$-uniform hypergraph, we can model the type-$t$ degree for an arbitrary node in $A$ using the binomial distribution given in (10). For this data, the type-$t$ affinity score will equal the maximum likelihood estimate for the affinity parameter $f_t$. To show this result, define $\bar{d}_t(a) = d(a) - d_t(a)$, i.e., the part of the degree that does not come from type-$t$ hyperedges.
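For simplicity and without loss of generality, denote the nodes in $A$ by the indices $1, 2, \ldots, |A|$. Treating degrees as independent, the likelihood function for observing the given degrees for a given parameter $f_t$ is

$$\mathcal{L}(f_t) = \prod_{a=1}^{|A|} \binom{d(a)}{d_t(a)} f_t^{\,d_t(a)} (1 - f_t)^{d(a) - d_t(a)}.$$

The log-likelihood function is

$$\log \mathcal{L}(f_t) = \sum_{a=1}^{|A|} \left[ \log \binom{d(a)}{d_t(a)} + d_t(a) \log f_t + \left(d(a) - d_t(a)\right) \log (1 - f_t) \right].$$

Taking the derivative with respect to the parameter $f_t$ and setting it equal to zero, we find that the log-likelihood is maximized when

$$f_t = \frac{\sum_{a=1}^{|A|} d_t(a)}{\sum_{a=1}^{|A|} d(a)},$$

so the maximizer is exactly equal to the type-$t$ affinity index.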
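As a quick numerical sanity check of this maximum likelihood claim, the sketch below evaluates the binomial log-likelihood on a grid and compares the maximizer with the ratio of summed degrees. The degree pairs are made up for illustration, and the constant binomial-coefficient terms are dropped because they do not depend on $f_t$.

```python
from math import log

def log_likelihood(f, degrees):
    """Binomial log-likelihood in f, omitting terms that do not depend on f."""
    return sum(dt * log(f) + (d - dt) * log(1 - f) for d, dt in degrees)

# Hypothetical observed (d(a), d_t(a)) pairs for four nodes in class A.
degrees = [(5, 2), (3, 1), (4, 3), (6, 2)]
affinity = sum(dt for _, dt in degrees) / sum(d for d, _ in degrees)

# Grid search over f in (0, 1) with step 0.001.
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=lambda f: log_likelihood(f, degrees))
```

Because the log-likelihood is concave in $f_t$, the grid maximizer lands on a grid point adjacent to the closed-form estimate.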

Baseline Scores and Proofs of Propositions 1 and 2
In order to determine the meaningfulness of a type-$t$ affinity score, we compare it against a baseline score representing a null probability for participation in type-$t$ hyperedges. For class $X$, the standard type-$t$ baseline score $b_t(X)$ measures the probability that a node $v \in X$ forms a type-$(X, t)$ hyperedge (i.e., there are exactly $t$ nodes from $v$'s class in the hyperedge) if it selects $k - 1$ other nodes from $V$ uniformly at random. Formally,

$$b_t(X) = \frac{\binom{|X| - 1}{t - 1} \binom{n - |X|}{k - t}}{\binom{n - 1}{k - 1}}. \tag{15}$$

Comparing the type-$t$ affinity $h_t(X)$ against $b_t(X)$ generalizes a standard approach for checking for homophily in graphs. When $k = t = 2$, the hypergraph affinity score $h_t(X)$ equals the homophily index of a graph, and the type-2 baseline score is

$$b_2(X) = \frac{|X| - 1}{n - 1}.$$

As $n \to \infty$, this converges to the class proportion $|X|/n$, which is typically used as the standard baseline for the homophily index of a graph. In addition to generalizing the baseline for a graph homophily index, our hypergraph baseline scores satisfy the following intuitive interpretation, as given in the main text.
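Proposition 1. Let $H^*_{k,n} = (V, E)$ be the complete $k$-uniform hypergraph on $n$ nodes. The type-$t$ affinity score for class $X \subseteq V$ equals the type-$t$ baseline score in (15).

Proof. In the complete hypergraph, the number of type-$(X, t)$ hyperedges is $m_t(X) = \binom{|X|}{t} \binom{n - |X|}{k - t}$, the total number of ways to choose $t$ nodes from $X$ and $k - t$ nodes that are not in $X$. The type-$t$ affinity score for class $X$ is therefore

$$h_t(X) = \frac{t \binom{|X|}{t} \binom{n - |X|}{k - t}}{\sum_{i=1}^{k} i \binom{|X|}{i} \binom{n - |X|}{k - i}} = \frac{\binom{|X| - 1}{t - 1} \binom{n - |X|}{k - t}}{\sum_{i=1}^{k} \binom{|X| - 1}{i - 1} \binom{n - |X|}{k - i}},$$

where the second equality uses $i \binom{|X|}{i} = |X| \binom{|X| - 1}{i - 1}$. This is the same as $b_t(X)$: the numerator is identical, and the denominator is an alternative way to list all the possible ways to select $k - 1$ nodes from a set of $n - 1$ nodes, by separately counting how many of each type of hyperedge could be formed.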
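Proposition 1 is easy to verify numerically with exact rational arithmetic. The following is a small sketch under illustrative parameters ($n = 9$, $|X| = 4$, $k = 3$); the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def baseline_score(t, size_x, n, k):
    """Standard type-t baseline: pick k-1 other nodes uniformly at random."""
    return Fraction(comb(size_x - 1, t - 1) * comb(n - size_x, k - t),
                    comb(n - 1, k - 1))

def complete_affinity(t, size_x, n, k):
    """Type-t affinity of class X in the complete k-uniform hypergraph on n nodes."""
    counts = {i: comb(size_x, i) * comb(n - size_x, k - i) for i in range(1, k + 1)}
    return Fraction(t * counts[t], sum(i * c for i, c in counts.items()))

n, size_x, k = 9, 4, 3
baselines = [baseline_score(t, size_x, n, k) for t in range(1, k + 1)]
affinities = [complete_affinity(t, size_x, n, k) for t in range(1, k + 1)]
```

With these values both lists equal $[10/28, 15/28, 3/28]$ (stored in lowest terms), and the baseline scores sum to one, as expected of a probability distribution over hyperedge types.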
To provide further intuition for the baseline scores, we observe that the type-$t$ baseline score is also related to the type-$t$ hypergraph affinity for a hypergraph obtained by generating hyperedges at random without regard to node class. Consider a cardinality-based HSBM where for some $p \in (0, 1)$, $p_i = p$ for all $i \in \{0, 1, 2, \ldots, k\}$. If $M_t$ is a random variable representing the number of type-$t$ hyperedges, then

$$\mathbb{E}[M_t] = p \binom{|A|}{t} \binom{|B|}{k - t}.$$

If we replace $m_t$ with this expected value $\mathbb{E}[M_t]$ in the definition for $h_t(A)$ given in equation (3), then we exactly recover the type-$t$ baseline score for class $A$:

$$\frac{t \cdot \mathbb{E}[M_t]}{\sum_{i=1}^{k} i \cdot \mathbb{E}[M_i]} = \frac{t \binom{|A|}{t} \binom{|B|}{k - t}}{\sum_{i=1}^{k} i \binom{|A|}{i} \binom{|B|}{k - i}} = b_t(A). \tag{17}$$

An analogous result also holds for baseline scores of class $B$. Proposition 2 from the main text is an even stronger result, showing that when hyperedges are generated at random without regard for node labels, the ratio scores $h_t(X)/b_t(X)$ of the resulting hypergraph converge to one.

Proposition 2. Fix any $p \in (0, 1)$ and a positive integer $k$, and let $H = (V, E)$ be a random hypergraph on $n$ nodes that is formed by turning each $k$-tuple of nodes in $V$ into a hyperedge with probability $p$. As $n \to \infty$, the ratio scores for a class $X \subseteq V$ with $|X| = \Theta(n)$ converge in probability to 1.
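Proof. Our goal is to show that for every $\varepsilon > 0$ and $\delta > 0$, there exists some $n_0 \in \mathbb{N}$ such that for all $n > n_0$ and for all $t \in [k]$, with probability at least $1 - \delta$,

$$\left| \frac{h_t(X)}{b_t(X)} - 1 \right| < \varepsilon.$$

For $i \in [k]$, let $N_i$ be the number of $k$-tuples with exactly $i$ nodes from class $X$, and let $M_i$ be the number of hyperedges of type-$(X, i)$ (exactly $i$ nodes from class $X$) in $H$. The random variable $M_i$ is binomially distributed, $M_i \sim \text{Binom}(N_i, p)$, and has the following expected value and variance:

$$m_i = \mathbb{E}[M_i] = p N_i, \qquad \text{Var}(M_i) = N_i\, p (1 - p).$$

As long as $M_i > 0$ for some $i \in [k]$, the type-$t$ affinity score for $H$ is well defined and equals

$$h_t(X) = \frac{t \cdot M_t}{\sum_{i=1}^{k} i \cdot M_i}. \tag{18}$$

From (17) we know that replacing $M_i$ with $m_i$ in (18) yields the type-$t$ baseline score for $H$:

$$b_t(X) = \frac{t \cdot m_t}{\sum_{i=1}^{k} i \cdot m_i}.$$

To prove that $h_t(X)/b_t(X)$ converges to one, we first prove several facts about the limiting behavior of $M_i/m_i$. By Chebyshev's inequality, we know that for any $i \in [k]$ and any $\varepsilon > 0$,

$$\Pr\left( \left| \frac{M_i}{m_i} - 1 \right| \geq \varepsilon \right) \leq \frac{\text{Var}(M_i)}{\varepsilon^2 m_i^2} = \frac{1 - p}{\varepsilon^2\, p\, N_i}.$$

Since $N_i = O(n^k)$ for every $i \in [k]$, this establishes that as long as $n$ is large enough, $M_i/m_i$ can be made arbitrarily close to one with high probability. With this in mind, fix $\varepsilon > 0$ and $\delta > 0$, and choose $n_0$ so that for all $n > n_0$, with probability at least $1 - \delta$,

$$\left| \frac{M_i}{m_i} - 1 \right| < \tilde{\varepsilon} \tag{20}$$

for all $i \in [k]$, where $\tilde{\varepsilon} < \min\left\{ \frac{1}{2}, \frac{\varepsilon}{4} \right\}$. This implies that $M_i > 0$ and $(1 - \tilde{\varepsilon}) < \frac{M_i}{m_i} < (1 + \tilde{\varepsilon})$, and therefore we also know that

$$\frac{1}{1 + \tilde{\varepsilon}} < \frac{m_i}{M_i} < \frac{1}{1 - \tilde{\varepsilon}}. \tag{21}$$

For our final step of the proof we will make use of the following useful inequality, which holds for two sets of positive numbers $\{a_1, a_2, \ldots, a_\ell\}$ and $\{b_1, b_2, \ldots, b_\ell\}$ and an arbitrary integer $\ell$:

$$\min_{i} \frac{a_i}{b_i} \leq \frac{\sum_{i=1}^{\ell} a_i}{\sum_{i=1}^{\ell} b_i} \leq \max_{i} \frac{a_i}{b_i}. \tag{22}$$

Since $\frac{h_t(X)}{b_t(X)} = \frac{M_t}{m_t} \cdot \frac{\sum_{i} i \cdot m_i}{\sum_{i} i \cdot M_i}$, inequality (22) shows that the second factor lies between $m_j/M_j$ and $m_\ell/M_\ell$ for some integers $j$ and $\ell$. Using (20), (21), and (22), we know that with probability at least $1 - \delta$,

$$\left| \frac{h_t(X)}{b_t(X)} - 1 \right| < \frac{1 + \tilde{\varepsilon}}{1 - \tilde{\varepsilon}} - 1 = \frac{\tilde{\varepsilon}}{1 - \tilde{\varepsilon}} + \frac{\tilde{\varepsilon}}{1 - \tilde{\varepsilon}} < 4\tilde{\varepsilon} < \varepsilon.$$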

Hypergraph Homophily Impossibility Results
We restate our main results for hypergraph homophily, exactly as given in the main text.
• If $k$ is odd, it is impossible for both classes to simultaneously exhibit majority homophily.
• If $k$ is even, it is impossible for both classes to exhibit majority homophily if additionally $h_{k/2}(X) > b_{k/2}(X)$ for one of the classes $X \in \{A, B\}$.
In the main text we include a full proof of Theorem 3. Theorem 4 is nearly identical and relies on simply adding one extra constraint and repeating the same basic steps. Here we provide full details for our majority homophily result for odd k, and show how they can be altered to yield the result for even k. For clarity and ease of presentation, we include many of the same steps as given in the main text, in some cases with expanded explanations.
Throughout the section, $H = (V, E)$ denotes a hypergraph with two node classes $\{A, B\}$ and hyperedges of a fixed size $k$. For $t \in \{0, 1, 2, \ldots, k\}$, $m_t$ represents the number of hyperedges in $H$ where exactly $t$ out of the $k$ nodes in the hyperedge come from class $A$. As before, $h_t(A)$ and $h_t(B)$ denote the type-$t$ affinity scores for classes $A$ and $B$ respectively, and can both be expressed in terms of absolute hyperedge counts:

$$h_t(A) = \frac{t \cdot m_t}{\sum_{i=1}^{k} i \cdot m_i}, \qquad h_t(B) = \frac{t \cdot m_{k-t}}{\sum_{i=0}^{k} (k - i) \cdot m_i}.$$

Our results apply to a generalized notion of baseline scores.
Definition. We will refer to the set of baseline scores $\{b_t(X) : t \in [k], X \in \{A, B\}\}$ as realizable or generalized baseline scores if they satisfy the following two assumptions: • There exists some two-class, $k$-uniform hypergraph $G$ such that for each $t \in \{1, 2, \ldots, k\}$, $b_t(A)$ and $b_t(B)$ are the type-$t$ affinity scores for classes $A$ and $B$ in $G$.
As long as $\min\{|A|, |B|\} \geq k$, the standard baseline scores satisfy the above definition, by Proposition 1. We now recall the definition of majority homophily presented in the main text.

Impossibility Result for Odd k
Recall from the main text that when $k$ is odd, requiring both classes to exhibit majority homophily induces a constraint on each hyperedge count $m_t$ for $t \in \{0, 1, 2, \ldots, k\}$. This is due to the fact that type-$(A, t)$ hyperedges ($t$ nodes from class $A$) are also type-$(B, k - t)$ hyperedges ($k - t$ nodes from class $B$). Applying a few steps of algebra to $h_t(A) > b_t(A)$ and $h_t(B) > b_t(B)$, we can show that for every $t > k/2$,

$$m_t > \frac{b_t(A)}{t} \sum_{i=1}^{k} i \cdot m_i \qquad \text{and} \qquad m_{k-t} > \frac{b_t(B)}{t} \sum_{i=0}^{k} (k - i) \cdot m_i.$$

If there existed a type $j$ such that $m_j$ were not bounded below, we could set $m_j = 0$ and instead make all other hyperedge counts higher than would be expected at random, and in doing so make most hyperedge types overexpressed relative to the baseline. We will show that this is not possible when $m_t$ is lower bounded for all $t \in \{0, 1, 2, \ldots, k\}$ as shown above. However, it is not immediately clear why lower bounds for each hyperedge type cannot all be satisfied simultaneously, especially given that half of the constraints depend on baseline scores for class $A$, while the other bounds depend on baseline scores for class $B$, which might be very different. Proposition 6 in the main text highlights one key challenge in proving that majority homophily cannot be satisfied by two classes at once.

Proposition 6. If any inequality from (25) and (27) is discarded, it is possible to construct a two-class $k$-uniform hypergraph satisfying the remaining inequalities.
To prove this result, it will be convenient to consider the linear program (LP) that we will use to prove our impossibility results for majority homophily. A proof of Proposition 6 will follow by considering what happens in the case of the LP obtained by removing one constraint.
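Linear program for measuring homophily

The following linear program, with objective variable $\gamma$ and $r = (k + 1)/2$, encodes the maximum amount of homophily that can be satisfied by classes $A$ and $B$ simultaneously.

$$\begin{array}{lll} \text{maximize} & \gamma & \\ \text{subject to} & t \cdot x_t - b_t(A) \sum_{i=1}^{k} i \cdot x_i \geq \gamma & \text{for } t \in \{r, \ldots, k\} \\ & t \cdot x_{k-t} - b_t(B) \sum_{i=0}^{k} (k - i) \cdot x_i \geq \gamma & \text{for } t \in \{r, \ldots, k\} \\ & \sum_{i=0}^{k} x_i = 1, \quad x_i \geq 0 & \text{for } i \in \{0, 1, \ldots, k\}. \end{array} \tag{28}$$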
Recall from the main text that there is a variable $x_t \geq 0$ for each type of hyperedge in some $k$-uniform hypergraph. More specifically, the constraint $\sum_{i=0}^{k} x_i = 1$ encodes the fact that $x_t$ represents the proportion of hyperedges that are of type-$t$ (i.e., $t$ out of $k$ nodes are from class $A$). Writing $\gamma$ for the objective variable of the linear program, the constraint for class $A$ and type $t$ can be rearranged into the inequality:

$$\frac{t \cdot x_t}{\sum_{i=1}^{k} i \cdot x_i} \geq b_t(A) + \frac{\gamma}{\sum_{i=1}^{k} i \cdot x_i}.$$

This constrains the hypergraph type-$t$ affinity score for class $A$ to be larger than its baseline score by at least an additive term $\gamma / \sum_{i=1}^{k} i \cdot x_i$, which will be positive if and only if $\gamma$ is positive. The second set of constraints encodes similar bounds for the affinity scores of class $B$. A feasible solution with $\gamma = 0$ can always be achieved if the $x_i$ variables represent hyperedge proportions for a hypergraph whose affinity scores are equal to the generalized baseline scores. The following lemma shows that if the constraints are satisfied for some $\gamma > 0$, this means there exists a two-class $k$-uniform hypergraph where both classes exhibit majority homophily.
Lemma 7. Let $\gamma^*$ be the optimal objective value of the linear program in (28). There exists a two-class $k$-uniform hypergraph $H$ with both classes exhibiting majority homophily if and only if $\gamma^* > 0$.
Proof. Given a hypergraph where both classes satisfy majority homophily, let $x_i = m_i/M$, where $M$ is the total number of hyperedges and $m_i$ is the number of type-$i$ hyperedges. The type-$t$ affinity for class $A$ is given by

$$h_t(A) = \frac{t \cdot m_t}{\sum_{i=1}^{k} i \cdot m_i} = \frac{t \cdot x_t}{\sum_{i=1}^{k} i \cdot x_i}.$$

A similar expression in terms of $x_i$ variables can be shown for affinity scores for class $B$. Choose the maximum value of the objective variable $\gamma$ so that all constraints are still satisfied. Since both classes are assumed to satisfy majority homophily, this value will be strictly greater than zero.

If on the other hand we assume that $\gamma^* > 0$, the variable $x_i$ will represent the proportion of type-$i$ hyperedges in some two-class hypergraph where both classes exhibit majority homophily. All coefficients in the LP are rational, so its solution will be rational as well. We can therefore scale the $x_i$ variables by a common denominator $C$ so that the value $M_i = C x_i$ is an integer. To construct the appropriate hypergraph, generate $M_i$ hyperedges of type-$i$ for each $i \in \{0, 1, 2, \ldots, k\}$. This can always be done by generating hyperedges that are completely disjoint. If one desires a specific balance between the number of nodes in classes $A$ and $B$, isolated nodes from either class can be added. The resulting hypergraph provides the desired example.
Proof of Proposition 6

Observe that Proposition 6 is equivalent to stating that if we alter the above LP by removing any one of the constraints for class $A$, of the form $t \cdot x_t - b_t(A) \sum_{i=1}^{k} i \cdot x_i \geq \gamma$, or any constraint for class $B$, of the form $t \cdot x_{k-t} - b_t(B) \sum_{i=0}^{k} (k - i) \cdot x_i \geq \gamma$, where $\gamma$ is the LP objective variable, then the optimal solution to the resulting LP will be strictly greater than zero. We now show why this is true. Let $r = (k + 1)/2$. If $\gamma = 0$, these constraints are all satisfied tightly by setting $\bar{x}_i = m_i/M$, where $m_i$ is the number of type-$i$ hyperedges and $M$ is the total number of hyperedges, in the hypergraph whose affinity scores equal the given generalized baseline scores. Define $\bar{S}_A = \sum_{i=1}^{k} i \cdot \bar{x}_i$ and $\bar{S}_B = \sum_{i=0}^{k} (k - i) \cdot \bar{x}_i$. We then have $t \cdot \bar{x}_t = b_t(A)\, \bar{S}_A$ and $t \cdot \bar{x}_{k-t} = b_t(B)\, \bar{S}_B$ for $t \in \{r, \ldots, k\}$. Satisfying the LP constraints for some $\gamma > 0$ is equivalent to satisfying the following set of strict inequalities, one for each of the $x_i$ variables:

$$x_t > \frac{b_t(A)}{t} \sum_{i=1}^{k} i \cdot x_i \quad \text{for } t \in \{r, \ldots, k\}, \tag{30}$$

$$x_{k-t} > \frac{b_t(B)}{t} \sum_{i=0}^{k} (k - i) \cdot x_i \quad \text{for } t \in \{r, \ldots, k\}. \tag{31}$$

If we remove the constraint associated with variable $x_t$ for some $t \in \{1, 2, \ldots, k - 1\}$, we can satisfy the remaining inequalities strictly by setting $x_t = 0$ and keeping all other variables the same. In this case, the right hand side of the inequalities in (30) and (31) will strictly decrease, but the left hand side will not change. This set of variables must afterwards be normalized to sum to one, to ensure feasibility for the LP, but this does not change the fact that all inequalities (except the one we discarded) are satisfied strictly. If we remove the constraint associated with $x_0$ or $x_k$, the proof is similar, but is slightly more involved since the first $r$ inequalities do not involve $x_0$ and the second set of $r$ inequalities do not involve $x_k$. We will prove the result when discarding the inequality that lower bounds $x_0$. By symmetry, the result holds in the same way if we removed the inequality for $x_k$.
After removing the inequality for $x_0$, consider $\varepsilon > 0$ and the following set of new variables: For this new set of variables, we have Also, Observe that the first set of $r$ constraints, which are associated with class $A$ and variables $x_k$ to $x_r$, are satisfied strictly with the new set of variables. For $t \in \{k, k - 1, \ldots, r\}$, we have: where we have used the fact that $b_t(A) < 1$. Finally, we must simply choose any $\varepsilon > 0$ small enough that the set of inequalities for the variables $x_{r-1}, x_{r-2}, \ldots, x_2, x_1$ are also satisfied strictly, which is still possible since we set $x_0 = 0$. In particular, for $t \in \{r - 1, r - 2, \ldots, 2, 1\}$, we must satisfy Substituting in the definition of $\bar{x}_i$, this is equivalent to Using a few steps of algebra, we can see that this is true as long as The left side of the above expression is always positive. If the right side is negative for the given set of generalized baseline scores, the inequality is trivial. If the right side is positive, there exists some $\varepsilon > 0$ that is small enough to ensure the inequality holds strictly. Therefore, if we remove the LP constraint that lower bounds $x_0$, there exists a set of variables whose objective score is strictly positive. This in turn implies that we can remove one inequality from (23) and (24) and find a hypergraph that satisfies all of the others. This proves Proposition 6.
Primal-Dual LP Formulation. We now turn our attention back to the original linear program in (28) that includes all constraints. In order to prove results about optimal solutions of this LP, we first re-write it in a general form using matrix notation and compute its dual. Let $x = \begin{bmatrix} x_0 & x_1 & \cdots & x_k \end{bmatrix}^T$ and let $e$ be the all ones vector, so the constraint $\sum_{i=0}^{k} x_i = 1$ can be written as $e^T x = 1$. For odd $k$, let $r = (k + 1)/2$. Later we will show how to make adjustments to the LP when proving impossibility results for even $k$. We construct a matrix $B$ so that the set of constraints is encoded by $Bx \geq \gamma e$, where $\gamma$ is the LP objective variable. In order to write the constraints in this way, we carefully order the constraints in (32) and (33) based on our ordering of $x_i$ variables in $x$. The first $r$ rows of $B$ correspond to constraints in (32), starting with $t = k$ and decreasing $t$ until $t = r$. The second set of $r$ rows in $B$ corresponds to constraints (33), starting with $t = r$ and increasing until $t = k$. Applying standard techniques for computing the dual of a linear program, the LP from (28) and its dual linear program are then given by

$$\begin{array}{ll} \text{maximize} & \gamma \\ \text{subject to} & Bx \geq \gamma e \\ & e^T x = 1 \\ & x \geq 0 \end{array} \tag{34} \qquad\qquad \begin{array}{ll} \text{minimize} & \alpha \\ \text{subject to} & B^T y \leq \alpha e \\ & e^T y = 1 \\ & y \geq 0. \end{array} \tag{35}$$

When considering optimal variables for the dual LP, it will be convenient to work with a decomposed form of the matrix $B$. As an example, when $k = 3$, the matrix $B$ is given by This matrix can be decomposed as follows: In general for odd $k$, we can decompose the matrix $B$ in the following way: $B = D_k - D_b R$, where $D_k$ is a diagonal matrix with diagonal entries $[k, k - 1, \cdots, r, r + 1, r + 1, r, \cdots, k - 1, k]$, and $D_b$ is a diagonal matrix with diagonal entries The first $r$ rows of matrix $R$ are $\begin{bmatrix} k & k - 1 & \cdots & 1 & 0 \end{bmatrix}$, and the next $r$ rows are $\begin{bmatrix} 0 & 1 & \cdots & k - 1 & k \end{bmatrix}$.
We can quickly obtain a solution with an objective score of zero for the primal linear program. Recall that by assumption, the baseline scores correspond to affinity scores for some $k$-uniform hypergraph $G$. Let $m_t$ be the number of hyperedges of type $t$ in $G$, and $M$ be the total number of hyperedges. Then define a set of primal solutions $x$ by setting $x_t = m_t/M$ for $t \in \{0, 1, 2, \ldots, k\}$, and set the objective variable to zero. The fact that the affinity score in this hypergraph equals the baseline score means that this set of primal variables is feasible for the primal LP. The following lemma, which proves our majority homophily result for odd $k$ in Theorem 5, shows that these are in fact optimal primal solutions.
Lemma 8. For an odd integer $k$ and $r = (k + 1)/2$, define the constant $\beta = 2k \sum_{t=r}^{k} \frac{1}{t}$, and consider the set of dual variables $y_{A,t}$ and $y_{B,t}$ for $t \in \{r, \ldots, k\}$ given in (40), (41), and (42). If $Y = \sum_{t=r}^{k} \left( y_{A,t} + y_{B,t} \right)$, then the set of normalized dual variables defined by $\tilde{y}_{X,t} = y_{X,t}/Y$ for $X \in \{A, B\}$ and $t \in \{r, \ldots, k\}$ is feasible for the dual LP for majority homophily.

Proof. When considering variables for the dual LP (35), first recall that the matrix $B$ can be decomposed into the form $B = D_k - D_b R$, where $D_b$ is a diagonal matrix whose diagonal entries are baseline scores. In this way, each row and column of $B$ can be mapped to a pair $(X, t)$ where $t \in \{r, r + 1, \ldots, k\}$ represents a hyperedge type and $X \in \{A, B\}$ is a class. Therefore, each dual variable is also associated with an $(X, t)$ pair, which is why we doubly-index dual variables in $y$. In the remainder of the proof, we will show that the unnormalized dual variables given in the lemma statement are nonnegative and satisfy $B^T y = 0$. In their current form, these variables do not satisfy $e^T y = 1$, but this can easily be fixed by dividing the entries of $y$ by their sum to produce a vector $\hat{y}$ whose entries sum to 1. At this point, the vector $\hat{y}$ along with $\alpha = 0$ provides a feasible solution with objective score of zero for the dual LP, which will conclude the proof. Thus, in the remainder of the proof we prove that the variables in (40), (41), and (42) are nonnegative and satisfy $B^T y = 0$.
Nonnegativity of dual variables. The nonnegativity of dual variables follows from the nonnegativity of baseline scores. Note in particular that which shows that the denominator of $y_{B,k}$ is positive. The numerator is also positive by inspection. Since $y_{B,k} > 0$, we can see that all three terms in $y_{B,t}$ are positive. Finally, As before, the numerator and denominator are both positive, so $y_{A,t} > 0$.

Proving $B^T y = 0$. Given the decomposition $B = D_k - D_b R$, we can see that proving $B^T y = 0$ is equivalent to showing

$$D_k y = R^T D_b y. \tag{43}$$

If we doubly index entries of $D_k y$ using the same indexing as the $y$ entries, we see that Meanwhile, the right hand side of (43) is After canceling $k$ from both sides, the first row of the matrix equation (43) is:
Therefore, the equivalence between the first $r$ entries in the equation (43) will hold as long as we can prove that the following are both equivalent ways of writing $\bar{y} = y_{B,k}$: We can prove this by using the fact that there is a hypergraph $G$ whose affinity scores equal the baseline scores. More specifically, for Re-writing the numerator and denominator of the first ratio in (44) (and after scaling each expression by half of the constant defined in the lemma statement) we get
Similarly, we re-write the second ratio: Written this way, we see that the numerators are the same since It remains to show that the denominators are equal, which we prove by considering a sequence of equivalent statements: At this point our proof has shown that the first $r$ rows (the rows corresponding to class $B$) of the matrix equation $D_k y = R^T D_b y$ hold. We use a similar approach to show the remaining $r$ rows also hold. First note that for $t \in \{r, r + 1, \ldots, k\}$, The last entry of the equation is $[D_k y]_{A,k} = [R^T D_b y]_{A,k}$, which holds by the following sequence of equivalent statements: The last equation holds by the equivalent ways of writing $y_{B,k}$ shown in (44).
Finally, we confirm that $[D_k y]_{A,t} = [R^T D_b y]_{A,t}$ holds for $t \in \{r, \ldots, k - 1\}$. Let $\bar{y} = y_{B,k}$ and recall from the last step the expression relating $\bar{y}$ to $\sum_{i=r}^{k} y_{A,i}\, b_i(A)$. Each equation corresponding to one of the last $r$ rows has the form which again was shown in previous steps. At this point we have shown that all entries in the equation $D_k y = R^T D_b y$ hold. Therefore, the dual variables are nonnegative and satisfy $B^T y = 0$, concluding the proof.
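The odd-$k$ impossibility result and Proposition 6 can both be illustrated by exhaustive search on a small instance. The sketch below is only a finite check, not a substitute for the proof: it assumes $k = 3$, the standard baseline scores with $|A| = |B| = 3$, and hyperedge counts bounded by 6. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def baseline(t, size_x, n, k):
    return Fraction(comb(size_x - 1, t - 1) * comb(n - size_x, k - t), comb(n - 1, k - 1))

def satisfied(m, size_a, size_b, k):
    """Set of (class, t) pairs with t > k/2 whose affinity exceeds its baseline.

    m[t] is the number of hyperedges with exactly t nodes from class A.
    """
    n = size_a + size_b
    sum_a = sum(i * m[i] for i in range(k + 1))
    sum_b = sum((k - i) * m[i] for i in range(k + 1))
    ok = set()
    for t in range(k // 2 + 1, k + 1):
        if sum_a and Fraction(t * m[t], sum_a) > baseline(t, size_a, n, k):
            ok.add(("A", t))
        if sum_b and Fraction(t * m[k - t], sum_b) > baseline(t, size_b, n, k):
            ok.add(("B", t))
    return ok

K, SIZE_A, SIZE_B = 3, 3, 3
ALL = {("A", 2), ("A", 3), ("B", 2), ("B", 3)}

# Impossibility: no count vector in the search range satisfies all four inequalities.
found = [m for m in product(range(7), repeat=K + 1)
         if satisfied(m, SIZE_A, SIZE_B, K) == ALL]

# Proposition 6: dropping the bound on m_3 allows the other three to hold.
relaxed = satisfied((3, 20, 16, 0), SIZE_A, SIZE_B, K)
```

The count vector $(m_0, m_1, m_2, m_3) = (3, 20, 16, 0)$ gives $h_2(A) = 32/52$, $h_2(B) = 40/65$, and $h_3(B) = 9/65$, each above its baseline ($6/10$, $6/10$, and $1/10$), while $h_3(A) = 0$ falls below the discarded bound.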

Proof for even k
When $k$ is even, the definition of majority homophily does not place any restriction on type-$(k/2)$ hyperedges for either class, since neither class is strictly in the majority for these hyperedges. If the number of type-$(k/2)$ hyperedges is small enough, it is possible for both classes to exhibit majority homophily. As one example, starting with a complete hypergraph and deleting all hyperedges of type-$(k/2)$ will produce a hypergraph where both classes satisfy majority homophily. However, we can still prove an analogous impossibility result for even $k$ if we add one extra constraint. We restate and prove our result for even $k$.

Proof. The proof follows the same steps as the proof for odd $k$, with minor alterations to the linear program and the optimal dual variables. We highlight key changes that must be made and for brevity skip steps that are nearly identical to the proof of the previous result. Let $\ell = k/2$. Without loss of generality we prove the result is impossible if we restrict $h_\ell(A) > b_\ell(A)$. By symmetry, the same impossibility result holds if we added the new constraint for class $B$ instead. We begin by altering the LP from (28) to include an additional constraint: For this new linear program, we can again confirm that the optimal score will be greater than zero if and only if it is possible for both classes to exhibit majority homophily and for constraint (46) to hold. We then can again re-write the LP and its dual in the form shown in (34) and (35), by extending the matrix $B$ to include one extra row to account for the new constraint. By our assumptions about baseline scores, we know there exists a hypergraph $G$, with $M$ total hyperedges and $m_i$ hyperedges of type-$i$, such that the affinity scores of $G$ equal the baseline scores in question. A primal feasible solution with an objective score of zero can then be realized by setting $x_i = m_i/M$ and setting the objective variable to zero.
Next, let $r = k/2 + 1$ and construct a set of dual variables analogous to those used for odd $k$. The result again relies on the fact that $y_{B,k}$ can be written in two ways, using the fact that baseline scores correspond to affinity scores for some hypergraph $G$.
Using the same basic set of steps used in Lemma 8, we can show that $B^T y = 0$ and $y \geq 0$. Scaling the variables to sum to one produces a dual feasible solution with an objective score of zero, which proves the result.
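The even-$k$ example mentioned at the start of this proof (a complete hypergraph with all type-$(k/2)$ hyperedges deleted) can be checked numerically. The sketch below is illustrative only. It assumes that the standard affinity score of a class is the ratio of its total type-$t$ degree to its total degree, and that the baseline scores are the affinity scores of the complete hypergraph; the function names and the class sizes are hypothetical.

```python
from fractions import Fraction
from math import comb


def type_counts_complete(n_a, n_b, k):
    """m[t] = number of k-subsets with exactly t nodes from class A."""
    return [comb(n_a, t) * comb(n_b, k - t) for t in range(k + 1)]


def degree_affinity(m, k):
    """Degree-based affinity scores from type counts (assumed definition).

    A hyperedge with t class-A nodes adds t to the type-t degree total of
    class A and k - t to the type-(k - t) degree total of class B.
    Returns two dicts mapping t in 1..k to the score for A and for B.
    """
    deg_a = {t: t * m[t] for t in range(1, k + 1)}
    deg_b = {t: t * m[k - t] for t in range(1, k + 1)}
    tot_a, tot_b = sum(deg_a.values()), sum(deg_b.values())
    h_a = {t: Fraction(d, tot_a) for t, d in deg_a.items()}
    h_b = {t: Fraction(d, tot_b) for t, d in deg_b.items()}
    return h_a, h_b


def both_majority_homophilous(n_a, n_b, k):
    """Delete all type-(k/2) hyperedges and test strict majority homophily."""
    complete = type_counts_complete(n_a, n_b, k)
    base_a, base_b = degree_affinity(complete, k)  # baselines (assumption)
    pruned = list(complete)
    pruned[k // 2] = 0                             # remove balanced hyperedges
    h_a, h_b = degree_affinity(pruned, k)
    majority_types = [t for t in range(1, k + 1) if 2 * t > k]
    return all(h_a[t] > base_a[t] and h_b[t] > base_b[t]
               for t in majority_types)


print(both_majority_homophilous(6, 5, 4))
```

Exact rational arithmetic is used so that the strict inequalities are not affected by rounding.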

Impossibility Results for Normalized Bias Scores
One alternative approach to measuring an affinity score's deviation from baseline is to consider the normalized bias score introduced in the main text. For a class $X$, we denote the type-$t$ normalized bias score by $f_t(X)$. Our existing notion of strict majority homophily is equivalent to requiring $f_t(X) > 0$ whenever $t > k/2$. We see therefore that the same impossibility results and combinatorial limits apply to a natural notion of majority homophily for normalized bias scores. The natural way to define strict monotonic homophily for normalized bias scores is to require $f_t(X) > f_{t-1}(X)$ whenever $t > k/2$, which can be different from monotonicity of ratio scores. Nevertheless, in our empirical results we find that ratio scores and normalized bias scores often increase and decrease in similar patterns. Furthermore, we can prove the same type of impossibility results for normalized bias scores by adding one natural assumption regarding the balance in homophily levels exhibited by two node classes.
Proof. Consider first the case where $f_r(A) > 0$ and $f_r(B) > 0$. Assuming that normalized bias scores are strictly increasing for both classes then implies that $f_t(X) > 0$ whenever $t > k/2$ for each $X \in \{A, B\}$. This would mean that both classes satisfy strict majority homophily, which is impossible. If on the other hand we have $f_r(A) \leq 0$ and $f_r(B) \leq 0$, then $f_r(X) > f_{r-1}(X)$ means that the type-$r$ ratio score of class $X$ exceeds its type-$(r-1)$ ratio score, that is, $h_r(X)/b_r(X) > h_{r-1}(X)/b_{r-1}(X)$. Assuming that this holds for both classes at once is again a contradiction, as shown in the proof of Theorem 3.
We can similarly prove impossibility results for even values of $k$ by adding an additional assumption as we did in Theorem 4. The assumption that $f_r(A)$ and $f_r(B)$ share the same sign is in line with the goal of trying to understand whether two classes can satisfy the same homophily properties at the same time. We conjecture that Theorem 9 holds without this assumption, though we leave a more in-depth analysis for future work. In any case, this theorem confirms that even if there is some way for both classes to have strictly increasing normalized bias scores (which there may not be), there will still be a fundamental imbalance in their affinity scores and in the level of homophily they exhibit. This further confirms the message that natural notions of group homophily are governed by subtle combinatorial limits that must exist independent of human preferences and choices.
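As a small illustration of how normalized bias scores relate to ratio scores, the sketch below assumes a piecewise normalization: the deviation $h - b$ is divided by $1 - b$ when the affinity score is at or above baseline, and by $b$ otherwise. This is our reading of the main-text definition, which is not restated in this supplement, and the function name is hypothetical.

```python
from fractions import Fraction


def normalized_bias(h, b):
    """Normalized bias of affinity score h against baseline b (assumed form).

    Positive values are scaled by the room above baseline (1 - b) and
    negative values by the room below it (b), so the result is in [-1, 1].
    """
    if not 0 < b < 1:
        raise ValueError("baseline score must lie strictly between 0 and 1")
    if h >= b:
        return (h - b) / (1 - b)
    return (h - b) / b


# Above baseline: the sign matches strict majority homophily (h > b).
print(normalized_bias(Fraction(1, 2), Fraction(1, 4)))
# Below baseline: the score equals the ratio score h / b minus one.
print(normalized_bias(Fraction(1, 8), Fraction(1, 4)))
```

Under this assumed form, a non-positive score equals $h/b - 1$, which is why comparing two non-positive normalized bias scores amounts to comparing ratio scores.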

Derivation and Results for Alternative Affinity Scores
The hypergraph affinity scores we consider in the main text and the previous two sections of the supplement are based on ratios of typed degrees for nodes in a certain class. This directly generalizes the standard approach that has been used for measuring homophily in graphs [16]. Another natural approach is to consider affinity scores defined by ratios of hyperedge types. Formally, given the same $k$-uniform hypergraph $H = (V, E)$ where $m_t$ denotes the number of type-$t$ hyperedges (those with exactly $t$ nodes from class A), we can measure the following alternative affinity scores:
$$a_t(A) = \frac{m_t}{\sum_{i=1}^{k} m_i}, \qquad a_t(B) = \frac{m_{k-t}}{\sum_{i=0}^{k-1} m_i}.$$
For each class $X \in \{A, B\}$, these ratios directly measure the proportion of hyperedges of type-$(X, t)$, among all hyperedges involving at least one class $X$ node. In this section we show that all of our main theoretical results also hold for these alternative scores. This first of all highlights that our main results on hypergraph homophily are broadly true for a wide range of notions of affinity scores. Furthermore, as we shall also see, our main impossibility results are in fact easier to show for these alternative scores, and our proof that these scores correspond to maximum likelihood estimates of a certain model parameter is more direct and does not require approximations.
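A minimal sketch of how the alternative affinity scores could be computed from hyperedge type counts is given below. It assumes that `m[t]` counts hyperedges with exactly `t` class-A nodes, so that a type-$(B, t)$ hyperedge is one with $k - t$ class-A nodes; the function name and the example counts are hypothetical.

```python
from fractions import Fraction


def alternative_affinity(m):
    """Alternative affinity scores from type counts m[0..k].

    a_A[t]: share of type-(A, t) hyperedges among hyperedges with >= 1 A node.
    a_B[t]: share of type-(B, t) hyperedges among hyperedges with >= 1 B node.
    """
    k = len(m) - 1
    total_a = sum(m[1:])   # hyperedges with at least one class-A node
    total_b = sum(m[:k])   # hyperedges with at least one class-B node
    a_a = {t: Fraction(m[t], total_a) for t in range(1, k + 1)}
    a_b = {t: Fraction(m[k - t], total_b) for t in range(1, k + 1)}
    return a_a, a_b


# Example with k = 3: four all-B hyperedges, three with one A node, and so on.
a_a, a_b = alternative_affinity([4, 3, 2, 1])
print(a_a[3], a_b[3])
```

Each class's scores sum to one by construction, since every hyperedge containing a class-$X$ node has exactly one type-$(X, t)$.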

Baseline Scores for Alternative Affinities
Analogous to our approach for standard affinity scores, we define the baseline score for $a_t(X)$ to be the probability that we obtain a type-$(X, t)$ hyperedge if we select a $k$-tuple uniformly at random from among all $k$-tuples involving at least one $X$ node:
$$b_t(X) = \frac{\binom{|X|}{t}\binom{n-|X|}{k-t}}{\binom{n}{k} - \binom{n-|X|}{k}},$$
where $n$ is the total number of nodes and $|X|$ is the number of nodes in class $X$. The denominator counts all $k$-tuples involving at least one node from $X$. For simplicity, we use the same notation as we did for standard affinity scores. The proof of this lemma is omitted as it follows the same steps as Lemma 1. We will prove homophily impossibility results for the following more general notion of baseline scores.
Definition. Let $k$ be a fixed constant. The set of scores $\{b_t(X) : X \in \{A, B\}, t \in [k]\}$ are generalized baseline scores for alternative affinity scores $\{a_t(X) : X \in \{A, B\}, t \in [k]\}$ if the following two conditions hold:
• The scores $\{b_t(X)\}$ correspond to alternative affinity scores for some $k$-uniform hypergraph $G$ with two node classes.
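The baseline scores for alternative affinities can be sketched as follows, assuming they are computed by counting $k$-subsets: the numerator counts subsets with exactly $t$ nodes from class $X$, and the denominator counts subsets with at least one node from $X$. The function name and the class sizes in the example are hypothetical.

```python
from fractions import Fraction
from math import comb


def alternative_baseline(n_a, n_b, k):
    """Baseline scores b_t(A) and b_t(B) for t = 1..k (counting sketch)."""
    n = n_a + n_b

    def scores(n_x):
        # k-subsets with at least one node from the class of size n_x
        denom = comb(n, k) - comb(n - n_x, k)
        return {t: Fraction(comb(n_x, t) * comb(n - n_x, k - t), denom)
                for t in range(1, k + 1)}

    return scores(n_a), scores(n_b)


b_a, b_b = alternative_baseline(4, 3, 3)
print(b_a[3], b_b[3])
```

These values coincide with the alternative affinity scores of the complete $k$-uniform hypergraph on the same node classes, which is one way to see that the condition listed in the definition can be met.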

Interpreting Scores as Maximum Likelihood Estimates
We can interpret the alternative affinity score a t (A) as the maximum likelihood estimate for a certain affinity parameter of a binomial distribution for hyperedge data. An analogous interpretation also applies for class B.
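We begin by considering a slight variation of the cardinality-based HSBM. This new model still considers a set of $n$ nodes separated into two classes $\{A, B\}$, and generates typed hyperedges based on a set of probabilities $p = [\, p_0 \;\; p_1 \;\; \cdots \;\; p_k \,]$. Let $K_t$ be the set of $k$-tuples of type-$t$. For each $e \in K_t$, let $X_e$ be a Poisson random variable with parameter $p_t$, and let this represent the number of hyperedges placed at $e$:
$$X_e \sim \text{Poisson}(p_t).$$
Recall that the cardinality-based HSBM differs in that we instead defined $X_e \sim \text{Bernoulli}(p_t)$. When we consider $p_t = o(1)$, which will typically be the case, this Poisson distribution will be very close to a Bernoulli with parameter $p_t$. For a hypergraph $H$ generated from this distribution, let $M_j$ be the random variable representing the number of hyperedges of type-$j$ in $H$, which as a sum of independent Poisson random variables will also be Poisson:
$$M_j = \sum_{e \in K_j} X_e \sim \text{Poisson}(|K_j|\, p_j).$$
Define $M_A$ to be the random variable denoting the total number of hyperedges involving at least one node in $A$, which is also Poisson distributed:
$$M_A = \sum_{j=1}^{k} M_j \sim \text{Poisson}\Big(\sum_{j=1}^{k} |K_j|\, p_j\Big).$$
The random variable representing the number of these hyperedges that are not type-$j$ is given by
$$\tilde{M}_j = M_A - M_j \sim \text{Poisson}\Big(\sum_{i \in [k],\, i \neq j} |K_i|\, p_i\Big).$$
Consider now a fixed hypergraph $H$ with $m_A$ hyperedges involving at least one class-A node. If we assume this hypergraph was drawn from the random distribution given above, the random variable $M_t$, conditioned on the observed hyperedge count $m_A$, will be binomial:
$$M_t \mid (M_A = m_A) \sim \text{Binomial}(m_A, f_t),$$
where $f_t$ is an affinity parameter:
$$f_t = \frac{|K_t|\, p_t}{\sum_{j=1}^{k} |K_j|\, p_j}.$$
To see why, note that $M_t$ and $\tilde{M}_t$ are independent Poisson random variables that sum to $M_A$, and the conditional distribution of a Poisson random variable given its sum with an independent Poisson random variable is binomial, with success probability equal to its share of the total rate. Finally, given an observed hypergraph with $m_A$ hyperedges involving at least one node in $A$, the likelihood of observing $m_t$ hyperedges of type-$t$ under this model is
$$\Pr(M_t = m_t \mid M_A = m_A) = \binom{m_A}{m_t} f_t^{m_t} (1 - f_t)^{m_A - m_t}.$$
Taking a derivative of the log-likelihood function with respect to the parameter $f_t$ and setting it to zero will give the maximum likelihood estimate for $f_t$. This ends up being equal to the alternative type-$t$ affinity score,
$$\hat{f}_t = \frac{m_t}{m_A} = a_t(A).$$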
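The maximum likelihood claim can be sanity-checked numerically. The sketch below evaluates the binomial log-likelihood on a grid of candidate values for the affinity parameter and confirms that the maximizer is $m_t / m_A$. The grid search stands in for the calculus argument, and the function names and the counts in the example are hypothetical.

```python
from math import comb, log


def log_likelihood(f, m_t, m_a):
    """Binomial log-likelihood of m_t type-t hyperedges out of m_a."""
    return log(comb(m_a, m_t)) + m_t * log(f) + (m_a - m_t) * log(1 - f)


def grid_mle(m_t, m_a, steps=1000):
    """Candidate value in (0, 1) that maximizes the log-likelihood."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda f: log_likelihood(f, m_t, m_a))


m_t, m_a = 30, 120
print(grid_mle(m_t, m_a), m_t / m_a)
```

Because the log-likelihood is strictly concave in the parameter, the grid maximizer is the grid point closest to the exact maximizer.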

Impossibility Results
Analogous to our results for standard affinity scores, we define two notions of hypergraph homophily based on alternative baseline scores. Let $\{b_t(X) : X \in \{A, B\}, t \in [k]\}$ denote a set of generalized baseline scores.
Definition. Class $X \in \{A, B\}$ exhibits majority homophily if for all $t > k - t$,
$$a_t(X) > b_t(X).$$
Definition. Class $X \in \{A, B\}$ exhibits monotonic homophily if for all $t > k - t$,
$$\frac{a_t(X)}{b_t(X)} > \frac{a_{t-1}(X)}{b_{t-1}(X)}.$$
The proof of the following impossibility result for monotonic homophily follows the same steps as the proof for Theorem 3. In particular, these results can be shown by considering only two types of hyperedges.
Theorem 11. Let $H$ be a two-class, $k$-uniform hypergraph and $\{b_t(X)\}$ be a set of generalized baseline scores for alternative affinity scores $\{a_t(X)\}$. If $k$ is odd, it is impossible for both classes to exhibit monotonic homophily in terms of alternative affinity scores. If $k$ is even, then both classes can exhibit monotonic homophily, but in this case $\frac{a_\ell(X)}{b_\ell(X)} < \frac{a_{\ell-1}(X)}{b_{\ell-1}(X)}$ for $\ell = k/2$ and $X \in \{A, B\}$.
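The two-hyperedge-type argument behind the odd case can be spelled out as follows. This is a sketch under two assumptions of ours: monotonic homophily is read as the strict ratio condition $a_t(X)/b_t(X) > a_{t-1}(X)/b_{t-1}(X)$ for $t > k/2$, and we write $m_t$ for the type-$t$ hyperedge counts of $H$ and $g_t > 0$ (a hypothetical symbol) for those of the hypergraph $G$ that realizes the baseline scores.

```latex
% Ratio scores in terms of hyperedge counts, with positive constants
% c_A = \sum_{i=1}^{k} g_i / \sum_{i=1}^{k} m_i  and
% c_B = \sum_{i=0}^{k-1} g_i / \sum_{i=0}^{k-1} m_i :
\[
\frac{a_t(A)}{b_t(A)} = c_A \, \frac{m_t}{g_t},
\qquad
\frac{a_t(B)}{b_t(B)} = c_B \, \frac{m_{k-t}}{g_{k-t}} .
\]
% For odd k take r = (k+1)/2, so that k - r = r - 1 and k - (r - 1) = r.
% Monotonic homophily at t = r then requires, for class A and class B:
\[
\frac{m_r}{g_r} > \frac{m_{r-1}}{g_{r-1}}
\qquad \text{and} \qquad
\frac{m_{r-1}}{g_{r-1}} > \frac{m_r}{g_r} .
\]
% These two inequalities contradict each other, and they involve only
% hyperedges of types r - 1 and r.
```

For even $k$ with $\ell = k/2$, the same substitution at $t = \ell + 1$ for class B gives $m_{\ell-1}/g_{\ell-1} > m_\ell/g_\ell$, which is the reversed inequality at type $\ell$ stated in the theorem for class A; the statement for class B follows by symmetry.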
In order to prove impossibility results for majority homophily, we use the same linear programming based proof technique that we used for Theorem 5.

Theorem 12.
Let $H$ be a two-class, $k$-uniform hypergraph and $\{b_t(X)\}$ be a set of generalized baseline scores for alternative affinity scores $\{a_t(X)\}$. If $k$ is odd, it is impossible for both classes to exhibit majority homophily in terms of alternative affinity scores.
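Proof. Let $r = (k + 1)/2$. As in the proof for standard affinity scores, we consider a linear program whose optimal objective score is strictly positive if and only if it is possible for both classes to exhibit majority homophily with respect to the new alternative affinity scores. The variable $x_t$ again represents the proportion of hyperedges where $t$ of the nodes are from class A. We can express the linear program as well as its dual in the same general form as before, given in (60) and (61). In this form, $e$ is the all ones vector and the matrix $B$ encodes constraint expressions of the form
$$x_{k-t} - b_t(B) \sum_{i=1}^{k} x_{k-i} \quad \text{for } t = k, k-1, k-2, \ldots, r, \qquad (62)$$
$$x_t - b_t(A) \sum_{i=1}^{k} x_i \quad \text{for } t = r, r+1, \ldots, k. \qquad (63)$$
The matrix $B$ can be decomposed into the following form:
$$B = I - D_b E, \qquad (64)$$
where $I$ is the $2r \times 2r$ identity matrix, $D_b$ is a diagonal matrix with diagonal entries $b_k(B), b_{k-1}(B), \ldots, b_r(B), b_r(A), b_{r+1}(A), \ldots, b_k(A)$, and $E$ is a matrix that is all ones except for zeros in the last column of the first $r$ rows, and the first column in the last $r$ rows. Formally,
$$E = \begin{bmatrix} \mathbf{1}_{r \times (2r-1)} & \mathbf{0}_{r \times 1} \\ \mathbf{0}_{r \times 1} & \mathbf{1}_{r \times (2r-1)} \end{bmatrix}.$$
The zero entries reflect the fact that hyperedges of type zero do not affect the affinity scores for class A, and type-$k$ hyperedges do not affect affinity scores for class B.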
A primal solution with an objective score of 0 can be obtained by setting $x_j = m_j / (\sum_{i=0}^{k} m_i)$, where $m_i$ is the number of hyperedges of type $i$ in the hypergraph whose affinity scores are equal to the baseline scores $\{b_i(X)\}$. In order to find a set of dual variables with an objective score of zero, it suffices to find a vector $y$ that is strictly positive and satisfies $B^T y = 0$. Equivalently, given the decomposition of $B$ in (64), we want $y$ to satisfy
$$y = E^T D_b y. \qquad (65)$$
Each row and column of $B$ can be associated with a class $X$ and hyperedge type $t$, so we doubly index the dual variables as follows:
$$y^T = [\, y_{B,k} \;\; y_{B,k-1} \;\; \cdots \;\; y_{B,r} \;\; y_{A,r} \;\; \cdots \;\; y_{A,k-1} \;\; y_{A,k} \,].$$
Rows 2 through $2r - 1$ of $E^T$ have the value 1 in every column, so for this equation to hold we must have
$$y_{X,i} = \sum_{t=r}^{k} \big( b_t(A)\, y_{A,t} + b_t(B)\, y_{B,t} \big) \qquad (66)$$
for $i \in \{r, r + 1, \ldots, k - 1\}$ and $X \in \{A, B\}$. Thus, a necessary condition for satisfying (65) is that entries 2 through $2r - 1$ of $y$ are all equal. With this in mind, let $z > 0$ be a fixed positive value and construct a set of dual variables as follows:
$$y_{X,t} = z \;\text{ for } t \in \{r, \ldots, k-1\}, \qquad y_{X,k} = \frac{z \sum_{t=r}^{k-1} b_t(X)}{1 - b_k(X)}, \qquad X \in \{A, B\}.$$
As long as $z > 0$, we can see that $y_{B,k}$ and $y_{A,k}$ will also be strictly positive. We can also set $z$ so that the entries of $y$ sum to one. As long as we can show $B^T y = 0$, this means we have a set of dual variables with an objective score of zero, confirming that majority homophily cannot hold for both classes simultaneously. Variables $y_{B,k}$ and $y_{A,k}$ are explicitly chosen so that the first and last entries in equation (65) hold when all other variables are equal to a positive constant $z$. To check that $B^T y = 0$, we just need to confirm that (66) holds. Recall that the baseline scores can be written in terms of hyperedge counts for some hypergraph $G$, i.e., for $i \in [k]$,
$$b_i(A) = \frac{m_i}{\sum_{j=1}^{k} m_j}, \qquad b_i(B) = \frac{m_{k-i}}{\sum_{j=0}^{k-1} m_j}.$$
Substituting our choice of variables into the right hand side of (66), we confirm that the equation holds:
$$\sum_{t=r}^{k} \big( b_t(A)\, y_{A,t} + b_t(B)\, y_{B,t} \big) = y_{A,k} + y_{B,k} = z \, \frac{\sum_{t=r}^{k-1} m_t + \sum_{t=1}^{r-1} m_t}{\sum_{j=1}^{k-1} m_j} = z.$$
An analogous impossibility result also holds for even $k$ when we use alternative affinity scores.
Theorem 13. If $k$ is even, it is impossible for both classes to exhibit majority homophily if additionally $a_\ell(X) > b_\ell(X)$ for one class $X \in \{A, B\}$ when $\ell = k/2$.
Proof. The proof is nearly identical to the proof of Theorem 12. For even $k$, let $r = \frac{k}{2} + 1$ and $\ell = k/2$. We use the same linear program encoding the maximum possible amount of majority homophily, with an additional constraint for class A that encodes $a_\ell(A) > b_\ell(A)$. The LP and its dual can again be written in the form (60) and (61). The matrix $B$ is given by
$$B = I - D_b E,$$
where $I$ is the $(k + 1) \times (k + 1)$ identity matrix, $D_b$ is a diagonal matrix with diagonal entries $b_k(B), b_{k-1}(B), \ldots, b_r(B), b_\ell(A), b_r(A), \ldots, b_k(A)$, and $E$ is a matrix that is all ones except for zeros in the last column of the first $k/2$ rows, and zeros in the first column in the last $k/2 + 1$ rows. The dual variables of the linear program can be indexed as follows:
$$y^T = [\, y_{B,k} \;\; y_{B,k-1} \;\; \cdots \;\; y_{B,r} \;\; y_{A,\ell} \;\; y_{A,r} \;\; \cdots \;\; y_{A,k-1} \;\; y_{A,k} \,].$$
The proof again follows as long as we can find a nonnegative vector $y$ satisfying $B^T y = 0$, or equivalently $y = E^T D_b y$. In order to accomplish this, all but the first and last dual variable can be set equal to some positive value $z$, and then we can solve for $y_{A,k}$ and $y_{B,k}$:
$$y_{A,k} = \frac{z \sum_{t=\ell}^{k-1} b_t(A)}{1 - b_k(A)}, \qquad y_{B,k} = \frac{z \sum_{t=r}^{k-1} b_t(B)}{1 - b_k(B)}.$$
The only difference from the dual variables in Theorem 12 is that the summation in the numerator of $y_{A,k}$ starts from $\ell$ rather than from $r$. The rest of the result follows by showing algebraically that $B^T y = 0$.
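The dual certificates used in the proofs of Theorems 12 and 13 can be checked with exact arithmetic on a small instance. The sketch below is illustrative and rests on assumptions of ours: baseline scores are taken from the complete hypergraph, the interior dual variables are set to $z = 1$, and the two boundary variables are obtained by solving the first and last rows of $y = E^T D_b y$, giving $y_{X,k} = z \sum_t b_t(X) / (1 - b_k(X))$ with the sum running up to $k - 1$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def alt_baselines(n_a, n_b, k):
    """Alternative baseline scores of the complete k-uniform hypergraph."""
    m = [comb(n_a, t) * comb(n_b, k - t) for t in range(k + 1)]
    b_a = {t: Fraction(m[t], sum(m[1:])) for t in range(1, k + 1)}
    b_b = {t: Fraction(m[k - t], sum(m[:k])) for t in range(1, k + 1)}
    return b_a, b_b


def dual_certificate(n_a, n_b, k):
    """Return (rows, y, residual) with residual = y - E^T D_b y."""
    b_a, b_b = alt_baselines(n_a, n_b, k)
    r = k // 2 + 1                     # (k+1)/2 if k is odd, k/2 + 1 if even
    start_a = r if k % 2 else k // 2   # even k adds a class-A row at k/2
    rows = ([("B", t) for t in range(k, r - 1, -1)]
            + [("A", t) for t in range(start_a, k + 1)])
    base = {("A", t): b_a[t] for t in b_a}
    base.update({("B", t): b_b[t] for t in b_b})

    z = Fraction(1)
    y = {row: z for row in rows}
    y[("A", k)] = z * sum(b_a[t] for t in range(start_a, k)) / (1 - b_a[k])
    y[("B", k)] = z * sum(b_b[t] for t in range(r, k)) / (1 - b_b[k])

    def e_entry(row, col):
        # E is all ones except: class-B rows are zero in the last column,
        # class-A rows are zero in the first column.
        cls = row[0]
        if (cls == "B" and col == k) or (cls == "A" and col == 0):
            return 0
        return 1

    residual = []
    for col, row_at_col in enumerate(rows):
        rhs = sum(e_entry(row, col) * base[row] * y[row] for row in rows)
        residual.append(y[row_at_col] - rhs)
    return rows, y, residual


rows, y, residual = dual_certificate(4, 5, 4)
print(residual)
```

A zero residual together with strictly positive entries of `y` is what the proofs require; rescaling `y` to sum to one does not change either property.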

Dataset Information and Additional Experimental Results
In this section we provide details regarding the datasets used in the main text and additional experimental results. Further information about each dataset is available from the original source. For all experiments in the main text and the supplement, we used asymptotic baseline scores when forming the ratio plots (see the Materials and Methods section). Code and data sufficient to reproduce all experimental results in the main text and in the supplement are available on Zenodo (https://doi.org/10.5281/zenodo.7086798) as well as on GitHub (https://github.com/nveldt/HypergraphHomophily). In order to provide an additional comparison against graph-based approaches, we compute and report graph homophily indices obtained by projecting each hypergraph into a graph based on co-participation in group interactions. These graph homophily indices do provide some information regarding the level of homophily exhibited by each class of nodes in each dataset, but they do not allow us to capture any notion of majority or monotonic homophily. These graph scores also often hide the way in which group size affects the level of homophily that is exhibited by a class.
Co-authorship and gender The co-authorship dataset we considered is the DBLP Records and Entries for Key Computer Science Conferences [33], originally used in a study on women in computer science research [34] and available online at https://data.mendeley.com/datasets/3p9w84t5mr/1. The data constitutes 16 years of publications at top computer science conferences, from 2000 to 2015, listed on the DBLP bibliography database. We consider only authors in the dataset whose gender is known with high confidence, and discard all papers including an author of unknown gender. The resulting hypergraph has 105256 nodes (82620 male, 22636 female), with a maximum hyperedge size of 21. The number of hyperedges between size 2 and 4 is 74134.
We can project the hypergraph into a graph by including an edge between author i and j if they ever publish a paper together. If we perform this projection using all group interactions (i.e., all papers), the resulting graph has a homophily index of 0.270 for women and a homophily index of 0.821 for men. If we project only group interactions with 2-4 authors, the scores are similar: 0.261 and 0.828. In both cases, the scores are higher than the relative class proportions of 0.215 and 0.785.
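A minimal sketch of the projection step is shown below. It assumes that the graph homophily index of a class is the fraction of edge endpoints at nodes of that class whose other endpoint lies in the same class; this is a common definition that we adopt here as an assumption. The `weighted` option mirrors the weighted projection discussed for the congress bills dataset. The function name, node names, and labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def projected_homophily(hyperedges, label, weighted=False, max_size=None):
    """Graph homophily index per class after a clique projection.

    hyperedges: iterable of node collections; label: dict node -> class.
    weighted=False gives every co-participating pair a unit-weight edge;
    weighted=True uses the number of shared hyperedges as the edge weight.
    """
    weights = Counter()
    for edge in hyperedges:
        nodes = sorted(set(edge))
        if len(nodes) < 2 or (max_size is not None and len(nodes) > max_size):
            continue
        for u, v in combinations(nodes, 2):
            weights[(u, v)] += 1

    same, total = Counter(), Counter()
    for (u, v), w in weights.items():
        w = w if weighted else 1
        for x, other in ((u, v), (v, u)):
            total[label[x]] += w
            if label[x] == label[other]:
                same[label[x]] += w
    return {cls: same[cls] / total[cls] for cls in total}


papers = [("a", "b", "c"), ("a", "b"), ("c", "d")]
gender = {"a": "W", "b": "W", "c": "M", "d": "M"}
print(projected_homophily(papers, gender))
print(projected_homophily(papers, gender, weighted=True))
```

Restricting to small groups, as in the 2-4 author comparison above, corresponds to passing `max_size=4`.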

Congress bills and political parties
The congress bills dataset is made up of legislative bills co-sponsored by US politicians in the Senate and House of Representatives. The original data was collected by James Fowler [24,26]. For our experiments we consider a derivation of the dataset presented as timestamped hyperedges [25], available online at https://www.cs.cornell.edu/~arb/data/congress-bills. The hypergraph has 1718 nodes (810 Republican and 908 Democrat) and 83105 hyperedges ranging from size 2 to 25.
Figure S1: Results from running a bootstrapping procedure to check the robustness of empirical results. Ratio scores are shown for the DBLP co-authorship hypergraph (top row), the congress bills hypergraph (second row), the TripAdvisor hypergraph (third row), and the Walmart hypergraph (last row). These results indicate that affinity and ratio scores in each case are robust to perturbations in the data. For each hypergraph and a range of values of k, the procedure samples from the original set of size-k hyperedges with replacement, and computes affinity scores and ratio scores each time. Solid lines in the plots indicate the true affinity scores computed on the entire dataset, which are nearly identical to the mean affinity score obtained from 100 runs of this procedure. Lighter colored regions show two standard errors above and below the mean. In many plots, the error region is too small to be perceptible.
If we project all group interactions to a graph, the graph homophily indices for Republicans and Democrats are 0.499 and 0.591, respectively, and their class proportions are 0.471 and 0.529. In other words, graph homophily indices are only slightly above baseline. If we only consider group interactions of size 20 or smaller (the largest group size we considered for our hypergraph affinity scores in the main text), the homophily indices increase only slightly to 0.516 and 0.606. These scores are obtained if we use an unweighted graph projection, where nodes i and j have a unit weight edge if they ever co-participate in a hyperedge. We can also perform a weighted projection where the weight of an edge between nodes i and j equals the number of hyperedges they both participate in. If we perform this weighted projection for all groups of size up to 20, graph homophily indices for Republicans and Democrats are 0.586 and 0.724 respectively. These higher scores indicate that it is common to see repeated collaborations between the same two individuals from the same political party, which is also a valid notion of homophily. This is arguably a more accurate measure of the high levels of homophily present in the dataset, though this still does not capture the extreme notions of majority and monotonic group homophily that we reveal using our hypergraph measures.
Figure S2: Ratio scores (affinity / baseline, plotted against affinity type t) obtained from a bootstrapping procedure on the group pictures dataset, with one set of panels for each Flickr search query ("Family Portraits", "Wedding+bride+groom+portrait", and "group shot" or "group photo" or "group portrait"), indicating the robustness of our results for each group picture type. The solid lines indicate true affinity scores, which are very close to the mean affinity score from taking 100 different samples from the data. Lighter colored regions show one standard error above and below the mean from this bootstrapping procedure. On the subsampled datasets, we continue to see all the same major trends and differences between picture types as we do when using all of the data.
TripAdvisor hotels and locations The TripAdvisor hypergraph is derived from a review dataset originally used for research on opinion mining from online reviews [27], available online at https://www.cs.virginia.edu/~hw5x/dataset.html. We associate each hotel in the dataset from North America or Europe with a labeled node in a hypergraph. We discard reviews from the original data that are cross-listed reviews from other travel sites, and reviews where the reviewer name is simply "A TripAdvisor Member". For each remaining unique reviewer, we construct a hyperedge joining all hotels they reviewed. The resulting hypergraph has 8956 nodes and 130570 hyperedges of size 2 to 87. The main text shows MaHI and MoHI scores for group sizes up to k = 13. If we project groups of size 2 to 13 to an unweighted graph, the graph homophily indices for North America and Europe are 0.829 and 0.519, respectively. If we use a weighted graph projection, graph homophily indices increase to 0.893 and 0.613. In both cases, graph homophily indices are well above baseline class proportions of 0.502 and 0.498.
Table S1: Graph homophily indices obtained by projecting three different types of group picture datasets to graphs. We report indices obtained from projecting groups of size 2, 3, and 4, separately. We also provide graph homophily indices obtained when projecting multiple group sizes at once to a graph.
The authors of [28] derived a large hypergraph of co-purchased products from Walmart shopping trip data, and identified department labels for each product. This derived hypergraph is available at https://www.cs.cornell.edu/~arb/data/walmart-trips/. In our work, we restrict to the subset of grocery and clothes products, resulting in a hypergraph with 48480 nodes (26178 grocery products, and 22302 clothes products) and 47034 hyperedges of size 2 to 25.
The main text shows MaHI and MoHI scores for group sizes up to k = 14. If we project groups of size 2 to 14 to an unweighted graph, the graph homophily indices for Groceries and Clothing are 0.942 and 0.618, respectively. If we use a weighted graph projection, these indices increase only slightly to 0.948 and 0.621. Baseline scores (i.e., class proportions) are 0.540 and 0.460 for Groceries and Clothing, respectively.
Group pictures and gender The three different group picture datasets [29] were downloaded from http://chenlab.ece.cornell.edu/people/Andy/ImagesOfGroups.html. Images in the original dataset were obtained via three different Flickr search queries, producing sets of family pictures, wedding pictures, and general group pictures. We parse and store data for all group pictures with up to k = 10 people. The number of pictures with between two and four people for each type of group picture is 1051, 662, and 963 for family, wedding, and general group pictures respectively. We used even baseline scores for the experiments, which assumes an equal number of men and women. Although we do not have unique identifiers for people in the pictures, we checked that the data is indeed roughly gender balanced. Treating each person in each picture as unique, 51-52% of the subjects are female in each of the three group picture datasets.
In Figure 6 of the main text, the graph homophily indices we report are based on projecting all group pictures with up to 10 people to a graph based on co-appearance. For example, we reported graph homophily indices of 0.57 and 0.55 (for women and men, respectively) for wedding pictures. If we only project hyperedges of size 2 to 4 for wedding pictures, then the graph homophily indices are instead 0.45 for women and 0.40 for men, which are just below baseline values. Whether we project using all group pictures or only small group pictures (k = 2, 3, 4), reducing group interactions to pairwise co-appearances overlooks meaningful information about the way in which homophily depends on the group size. One way to partially remedy this issue is to separately project different group sizes to different graphs, and then compute a graph homophily index for each separate graph. In Table S1, we report the graph homophily indices obtained by projecting size 2, 3, and 4 group pictures to a graph, as well as scores obtained from projecting all groups of size 2-4, and all groups of size 2-10. Columns $\alpha_W$ and $\alpha_M$ represent the class proportions (i.e., baseline scores) for women and men, respectively. This approach of separately projecting different group sizes is already a departure from standard graph-based techniques for measuring homophily, which often project all group sizes at once. These separate graph projections do provide one way to observe the way homophily depends on group size. In wedding pictures, for example, these separated scores accurately capture the fact that homophily is not present for groups of size k = 2, but starts to become more present for groups of size k = 3 and k = 4. However, separately projecting different group sizes still does not capture the fact that ratio scores meaningfully differ within each group size k depending on affinity type $t \leq k$.
For example, our hypergraph scores capture the fact that in wedding group pictures with k = 4 people, type-3 affinity scores are below baseline while groups that are perfectly gender homogeneous or perfectly gender balanced are above baseline. Similarly, in family pictures with 4 people, our hypergraph measures capture the fact that type-2 affinities are above baseline, which intuitively reflects a high proportion of 4-person pictures of a husband, wife, one boy, and one girl. Graph homophily indices do not capture this observation.

Checking Robustness of Affinity Scores
By design, the affinity scores and the impossibility results we have considered apply to a specific hypergraph with a fixed set of hyperedges. In the main text, we therefore used all available hyperedge information when plotting results for each dataset we considered. Building a hypergraph from real data can be a noisy and imperfect process, and affinity scores will therefore vary depending on the quality of data and availability of group information in each context. We use a simple bootstrapping procedure to show that the basic patterns in affinity, ratio, and normalized bias scores for all of the hypergraphs we consider are stable to perturbations in the data. Given a hypergraph $H$ with $m_k$ hyperedges of size $k$, we sample $m_k$ of these hyperedges with replacement and compute typed affinity, ratio, and normalized bias scores from the set of sampled hyperedges in each case. We repeat the process 100 times for each dataset, and then compute the mean and standard error for the resulting ratio scores. Figures S1 and S2 display bootstrapping results for ratio scores. The same observations about robustness apply to affinity scores and normalized bias scores as well. Results on paper co-authorship, congress bill co-sponsorship, TripAdvisor reviews, and Walmart shopping trips are particularly robust to perturbations in the data. Average scores when bootstrapping are nearly identical to scores obtained when using the entire hypergraph, and there is a very small standard error (Fig. S1). Scores for group pictures vary slightly more depending on the sample, but still preserve the same shape and pattern. We observe the same basic differences among pictures with regard to group type (wedding, family, or general group picture) and size (Fig. S2).
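The resampling loop can be sketched as follows. This is an illustrative outline rather than the released implementation: it assumes the standard error is estimated by the standard deviation of the statistic across bootstrap runs, and it uses a simple stand-in statistic (the fraction of hyperedges of each type) in place of the affinity, ratio, and normalized bias scores. The function names, the seed, and the toy data are hypothetical.

```python
import random
from statistics import mean, stdev


def type_fractions(edges, k):
    """Stand-in statistic: fraction of hyperedges with t class-A nodes."""
    counts = [0] * (k + 1)
    for edge in edges:
        counts[sum(1 for lab in edge if lab == "A")] += 1
    return {t: counts[t] / len(edges) for t in range(k + 1)}


def bootstrap_scores(hyperedges, score_fn, runs=100, seed=0):
    """Mean and bootstrap standard error of each score over `runs` resamples.

    Each run draws len(hyperedges) hyperedges with replacement and applies
    score_fn, which must return a dict with the same keys on every call.
    """
    rng = random.Random(seed)
    m = len(hyperedges)
    samples = []
    for _ in range(runs):
        resample = [hyperedges[rng.randrange(m)] for _ in range(m)]
        samples.append(score_fn(resample))
    return {key: (mean(s[key] for s in samples),
                  stdev(s[key] for s in samples))
            for key in samples[0]}


# Toy size-3 hyperedges given directly as tuples of class labels.
edges = [("A", "A", "B")] * 30 + [("A", "B", "B")] * 10
stats = bootstrap_scores(edges, lambda es: type_fractions(es, 3), seed=1)
print(stats[2])
```

Fixing the seed makes the resampling reproducible, and swapping in a different `score_fn` leaves the loop unchanged.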

Experimental Results on Contact Data
In addition to the empirical results shown in the main text, we compute affinity scores with respect to gender for group gatherings among primary and high school students [25,35,36]. Each hypergraph consists of groups of students (in primary school [36] and high school [35] respectively) that gathered together in close proximity at some point during the day, as measured by wearable sensors. In high school student interactions of size two through four (Fig. S3), students have high tendencies for gathering in groups where all members are of the same gender. All other affinity scores are below baseline. Results for primary school interactions are nearly the same (Fig. S4), except in the case of groups of size two, where groups with two male students are just slightly below baseline. We applied bootstrapping to both datasets (see Section 4.1 for details), to confirm that our results are robust to perturbations in the data. For both hypergraphs we observe a small standard error around the mean affinity scores obtained across different subsamples of the hyperedges.
Figure S3: Affinity, ratio, and normalized bias scores with respect to gender for a contact hypergraph at a high school. Nodes in the hypergraph are students, and hyperedges indicate sets of students who were gathered in close proximity at some point, as measured by wearable sensors. Solid lines indicate affinity scores for the entire hypergraph, and lighter colored regions show one standard error around the mean affinity score obtained from a bootstrapping procedure on the hyperedges, indicating our results are robust to perturbations in the data. Both genders exhibit strong simple homophily, meaning a high tendency for gathering in groups where everyone is of the same gender. All other scores are below baseline. For gatherings of size three and four, ratio scores almost increase monotonically, though perfect monotonic increase is impossible, as shown by our theoretical results. However, monotonic increase in raw affinity scores is possible, as seen in plots from the first row. This results from having a roughly balanced number of gatherings of each type.
Figure S4: Affinity, ratio, and normalized bias scores with respect to gender for a contact hypergraph at a primary school. Hyperedges indicate a set of students (nodes) who were gathered in close proximity at some point, as measured by wearable sensors. Solid lines again indicate affinity scores for the entire dataset, and lighter colored regions show one standard error around the mean affinity score from a bootstrapping procedure. For groups of size three and four, both genders exhibit strong simple homophily, indicating a high tendency towards groups where everyone is of the same gender. For groups of size two, male students have affinity scores that are almost exactly equal to baseline.